Speech Modelling Using Subspace and EM Techniques
Authors
Abstract
Tony Robinson, Cambridge University Engineering Department, Cambridge CB2 1PZ, England

The speech waveform can be modelled as a piecewise-stationary linear stochastic state space system, and its parameters can be estimated using an expectation-maximisation (EM) algorithm. One problem is the initialisation of the EM algorithm. Standard initialisation schemes can lead to poor formant trajectories, yet these trajectories are important for vowel intelligibility. The aim of this paper is to investigate the suitability of subspace identification methods for initialising EM. The paper compares the subspace state space system identification (4SID) method with the EM algorithm. The 4SID and EM methods are similar in that they both estimate a state sequence (using Kalman filters and Kalman smoothers respectively) and then estimate parameters (using least-squares and maximum likelihood respectively). This similarity motivates the use of 4SID to initialise EM. In addition, 4SID is non-iterative and requires no initialisation, whereas EM is iterative and requires initialisation. However, 4SID is sub-optimal compared to EM in a probabilistic sense. In experiments on real speech, 4SID methods compare favourably with conventional initialisation techniques: they produce smoother formant trajectories, have greater frequency resolution, and produce higher likelihoods.

1 Work done while in Cambridge Engineering Dept., UK.
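To make the model class in the abstract concrete, the following is a minimal sketch of a linear stochastic state space system, x[t+1] = A x[t] + w[t] and y[t] = C x[t] + v[t], for a single stationary segment, together with the standard conversion from the eigenvalues of A to resonance (formant) frequencies and bandwidths. The notation (A, C, Q, R), the function names, the 8 kHz sampling rate, and the two example resonances are all assumptions made for illustration; the abstract does not give the paper's parameterisation, and neither 4SID nor EM is implemented here.

```python
import numpy as np


def resonator(freq_hz, bandwidth_hz, sample_rate):
    """2x2 state block whose eigenvalues are r * exp(+/- j * theta)."""
    r = np.exp(-np.pi * bandwidth_hz / sample_rate)  # pole radius from bandwidth
    theta = 2.0 * np.pi * freq_hz / sample_rate      # pole angle from frequency
    return r * np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta), np.cos(theta)]])


def simulate(A, C, Q, R, n_samples, rng):
    """Sample y[t] = C x[t] + v[t], x[t+1] = A x[t] + w[t] with Gaussian noise."""
    n, p = A.shape[0], C.shape[0]
    x = np.zeros(n)
    y = np.empty((n_samples, p))
    for t in range(n_samples):
        y[t] = C @ x + rng.multivariate_normal(np.zeros(p), R)  # observation noise v[t]
        x = A @ x + rng.multivariate_normal(np.zeros(n), Q)     # process noise w[t]
    return y


def formants(A, sample_rate):
    """Frequencies and bandwidths (Hz) implied by the complex eigenvalues of A."""
    eig = np.linalg.eigvals(A)
    eig = eig[eig.imag > 0]  # keep one eigenvalue from each conjugate pair
    freqs = np.angle(eig) * sample_rate / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(eig)) * sample_rate / np.pi
    order = np.argsort(freqs)
    return freqs[order], bandwidths[order]


# Hypothetical two-resonance segment at an assumed 8 kHz sampling rate.
sample_rate = 8000.0
A = np.zeros((4, 4))
A[:2, :2] = resonator(500.0, 80.0, sample_rate)
A[2:, 2:] = resonator(1500.0, 120.0, sample_rate)
C = np.array([[1.0, 0.0, 1.0, 0.0]])
Q = 0.01 * np.eye(4)
R = np.array([[0.001]])

y = simulate(A, C, Q, R, 400, np.random.default_rng(0))
freqs, bandwidths = formants(A, sample_rate)
```

In this sketch the formant values are read directly from a known A. In the setting the abstract describes, A would instead be estimated from the waveform, and a formant trajectory would be obtained by repeating the eigenvalue conversion on each stationary segment.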
Similar resources
Speech Enhancement Through an Optimized Subspace Division Technique
The speech enhancement techniques are often employed to improve the quality and intelligibility of the noisy speech signals. This paper discusses a novel technique for speech enhancement which is based on Singular Value Decomposition. This implementation utilizes a Genetic Algorithm based optimization method for reducing the effects of environmental noises from the singular vectors as well as t...
Constrained Subspace Modelling
When performing subspace modelling of data using Principal Component Analysis (PCA) it may be desirable to constrain certain directions to be more meaningful in the context of the problem being investigated. This need arises due to the data often being approximately isotropic along the lesser principal components, making the choice of directions for these components more-or-less arbitrary. Furt...
Modelling Decision Problems Via Birkhoff Polyhedra
A compact formulation of the set of tours neither in a graph nor its complement is presented and illustrates a general methodology proposed for constructing polyhedral models of decision problems based upon permutations, projection and lifting techniques. Directed Hamilton tours on n vertex graphs are interpreted as (n-1)-permutations. Sets of extrema of Birkhoff polyhedra are mapped to tours ...
A signal subspace approach for speech modelling and classification
In this paper, a speech classifier inspired by the signal subspace approach is developed. A novel signal subspace speech model is initially obtained via a rank-reducing subspace decomposition algorithm that is based on the SVD. Motivated by the assumption that the speech signal comprises short-term dynamics that are slowly changing, it follows that the signal subspace of the speech signal is...